On clustering procedures and nonparametric mixture estimation

Authors

  • Stéphane Auray
  • Nicolas Klutchnikoff
  • Laurent Rouvière
Abstract

This paper deals with nonparametric estimation of conditional densities in mixture models when additional covariates are available. The proposed approach first runs a clustering algorithm on the additional covariates to guess the mixture component of each observation. The conditional densities of the mixture model are then estimated with kernel density estimates applied separately to each cluster. We investigate the expected L1-error of the resulting estimates and derive optimal rates of convergence over classical nonparametric density classes, provided the clustering method is accurate. The performance of a clustering algorithm is measured by its maximal misclassification error. We obtain upper bounds on this quantity for a single linkage hierarchical clustering algorithm. Lastly, applications of the proposed method to mixture models involving electricity distribution data and simulated data are presented.
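The two-step procedure described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the assumption that the number of clusters is known in advance, and all function names (`single_linkage`, `gaussian_kde`, `cluster_then_kde`) are assumptions made for the example. It uses only the Python standard library.

```python
import math
import random


def single_linkage(points, n_clusters):
    """Single linkage clustering: repeatedly merge the two closest clusters
    (processing pairwise distances in increasing order) until n_clusters remain."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    remaining = n
    for _, i, j in edges:
        if remaining == n_clusters:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            remaining -= 1
    roots = sorted({find(i) for i in range(n)})
    label_of = {root: k for k, root in enumerate(roots)}
    return [label_of[find(i)] for i in range(n)]


def gaussian_kde(sample, bandwidth):
    """Return a kernel density estimate (Gaussian kernel, assumed here)."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(y):
        return sum(math.exp(-0.5 * ((y - s) / bandwidth) ** 2) for s in sample) / norm

    return density


def cluster_then_kde(covariates, responses, n_clusters, bandwidth):
    """Step 1: cluster the covariates. Step 2: one KDE per cluster of responses."""
    labels = single_linkage(covariates, n_clusters)
    estimates = {}
    for k in set(labels):
        sample = [y for y, lab in zip(responses, labels) if lab == k]
        estimates[k] = gaussian_kde(sample, bandwidth)
    return labels, estimates


# Simulated two-component mixture: covariates separate the components well,
# while the responses of the two components are what we want densities for.
random.seed(0)
covariates = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(30)]
covariates += [(10 + random.uniform(-1, 1), 10 + random.uniform(-1, 1)) for _ in range(30)]
responses = [random.gauss(0.0, 1.0) for _ in range(30)]
responses += [random.gauss(5.0, 1.0) for _ in range(30)]

labels, estimates = cluster_then_kde(covariates, responses, n_clusters=2, bandwidth=0.5)
```

In this toy setting the covariate groups are far apart, so the clustering step recovers the components exactly; the abstract's point is that the quality of the density estimates depends on how accurate that first step is.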


Similar resources

Outlier Detection and Clustering by Partial Mixture Modeling

Clustering algorithms based upon nonparametric or semiparametric density estimation are of more theoretical interest than some of the distance-based hierarchical or ad hoc algorithmic procedures. However, density estimation is subject to the curse of dimensionality, so care must be exercised. Clustering algorithms are sometimes described as biased since solutions may be highly influenced by ...


Bayesian Nonparametric Models

A Bayesian nonparametric model is a Bayesian model on an infinite-dimensional parameter space. The parameter space is typically chosen as the set of all possible solutions for a given learning problem. For example, in a regression problem the parameter space can be the set of continuous functions, and in a density estimation problem the space can consist of all densities. A Bayesian nonparametr...


Nonparametric Regression Estimation under Kernel Polynomial Model for Unstructured Data

Nonparametric estimation (NE) of the kernel polynomial regression (KPR) model is a powerful tool for visually depicting the effect of covariates on the response variable when the data are unstructured and heterogeneous. In this paper we introduce the KPR model, a mixture of nonparametric regression models with a bootstrap algorithm, which is considered in a heterogeneous and unstructured framewo...


Dirichlet Process Parsimonious Mixtures for clustering

Parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have proved successful, in particular in cluster analysis. They are generally estimated by maximum likelihood, and have also been considered from a parametric Bayesian perspective. We propose new Dirichlet Process Parsimonious mixtur...


Bayesian Density Regression and Predictor-dependent Clustering

JU-HYUN PARK: Bayesian Density Regression and Predictor-Dependent Clustering. (Under the direction of Dr. David Dunson.) Mixture models are widely used in many application areas, with finite mixtures of Gaussian distributions applied routinely in clustering and density estimation. With the increasing need for a flexible model for predictor-dependent clustering and conditional density estimation...




Publication date: 2017